Deep Learning Best Practices: Activation Functions & Weight Initialization Methods -- Part 1
One of the reasons deep learning has become more popular in the past decade is better learning algorithms, which have led to faster convergence and better performance of neural networks in general. Along with better learning algorithms, the introduction of better activation functions and better initialization methods helps us build better neural networks. Note: This article assumes that the reader has a basic understanding of neural networks, weights, biases, and backpropagation. In this article, we discuss some of the commonly used activation functions and weight initialization methods for training a deep neural network. To be more specific, we will be covering the following.
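Before diving into the individual functions, here is a minimal NumPy sketch of three activation functions that come up throughout this discussion. The function names and test inputs are illustrative choices, not taken from any particular library:

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs to (0, 1); saturates for large |x|, which can
    # slow learning via vanishing gradients.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Zero-centered cousin of sigmoid; squashes inputs to (-1, 1).
    return np.tanh(x)

def relu(x):
    # Passes positive values through unchanged and zeroes out negatives.
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))
print(tanh(x))
print(relu(x))
```

Each of these is applied element-wise to a layer's pre-activation outputs; their different saturation behaviors are what motivate the comparisons below.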
Deep Learning Best Practices (1) -- Weight Initialization
As a beginner in deep learning, one of the things I realized is that there isn't much online documentation that covers all the deep learning tricks in one place. There are lots of small best practices, ranging from simple tricks like weight initialization and regularization to slightly more complex techniques like cyclic learning rates, that can make training and debugging neural nets easier and more efficient. This inspired me to write this series of blogs, where I will cover as many nuances as I can to make implementing deep learning simpler for you. While writing this blog, the assumption is that you have a basic idea of how neural networks are trained. An understanding of weights, biases, hidden layers, activations, and activation functions will make the content clearer.
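As a taste of the weight-initialization tricks this series covers, the sketch below shows two standard schemes in plain NumPy. The layer sizes are arbitrary examples, and the helper names are my own, not from any framework:

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    # Glorot/Xavier uniform initialization: chosen so activation variance
    # stays roughly constant across layers with tanh/sigmoid activations.
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    # He (Kaiming) normal initialization: variance 2/fan_in compensates
    # for ReLU zeroing out roughly half of each layer's activations.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# Example: a 784 -> 256 -> 10 network (e.g. for MNIST-sized inputs).
W1 = xavier_init(784, 256)
W2 = he_init(256, 10)
print(W1.shape, W2.shape)
```

The key idea in both is the same: scale the random weights by the layer's fan-in (and fan-out) so that signals neither explode nor vanish as they pass through many layers.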
8 Deep Learning Best Practices I Learned About in 2017
Something I was really happy about accomplishing in 2017 was getting more practically involved with modern AI. I've studied a lot of math, which has certainly been fun, but haven't done any practical projects, and therefore have nothing to show for my efforts. To remedy this, in April, I applied for an AI Grant with the aim of building FastText skip-gram models for Kenyan speech. I became a finalist in the first round, but failed to win a grant. Then, this September, I applied to the international fellowship track of a now-complete class on Practical Deep Learning for Coders, Part 1 v2, taught by Jeremy Howard of fast.ai.